A flexible spectral modification method based on temporal decomposition and Gaussian mixture model
نویسندگان
چکیده
This paper presents a new spectral modification method to solve two drawbacks of conventional spectral modification methods, insufficient smoothness of the modified spectra between frames and ineffective spectral modification. To overcome the insufficient smoothness, a speech analysis technique called temporal decomposition (TD) is used to model the spectral evolution. Instead of modifying the speech spectra frame by frame, we only need to modify event targets and event functions, and the smoothness of the modified speech is ensured by the shape of the event functions. To overcome the ineffective spectral modification, we explore Gaussian mixture model (GMM) parameters for an input of TD to model the spectral envelope, and develop a new method of modifying GMM parameters in accordance with formant scaling factors. Experimental results show that the effectiveness of the proposed method is verified in terms of the smoothness of the modified speech and the effective spectral modification.
منابع مشابه
A novel technique for voice conversion based on style and content decomposition with bilinear models
This paper presents a novel technique for voice conversion by solving a two-factor task using bilinear models. The spectral content of the speech represented as line spectral frequencies is separated into so-called style and content parameterizations using a framework proposed in [1]. This formulation of the voice conversion problem in terms of style and content offers a flexible representation...
متن کاملNegative Selection Based Data Classification with Flexible Boundaries
One of the most important artificial immune algorithms is negative selection algorithm, which is an anomaly detection and pattern recognition technique; however, recent research has shown the successful application of this algorithm in data classification. Most of the negative selection methods consider deterministic boundaries to distinguish between self and non-self-spaces. In this paper, two...
متن کاملمقایسه روش های طیفی برای شناسایی زبان گفتاری
Identifying spoken language automatically is to identify a language from the speech signal. Language identification systems can be divided into two categories, spectral-based methods and phonetic-based methods. In the former, short-time characteristics of speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
متن کاملCombination of temporal domain SVD based speech enhancement and GMM based speech estimation for ASR in noise - evaluation on the AURORA2 task -
In this paper, we propose a noise robust speech recognition method by combination of temporal domain singular value decomposition(SVD) based speech enhancement and Gaussian mixture model(GMM) based speech estimation. The bottleneck of GMM based approach is a noise estimation problem. For this noise estimation problem, we incorporated the adaptive noise estimation in GMM based approach. Furtherm...
متن کاملRecognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
متن کامل